Streaming Dictionary Matching with Mismatches

Abstract

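In the k-mismatch problem we are given a pattern of length n and a text, and must find all locations where the Hamming distance between the pattern and the text is at most k. A series of recent breakthroughs have resulted in an ultra-efficient streaming algorithm for this problem that requires only $$\mathcal{O}(k \log \frac{n}{k})$$ space and $$\mathcal{O}(\log \frac{n}{k} (\sqrt{k \log k} + \log^3 n))$$ time per letter (Clifford, Kociumaka, Porat, SODA 2019). In this work, we consider a strictly harder problem called dictionary matching with k mismatches. In this problem, we are given a dictionary of d patterns, each of length at most n, and must find all substrings of the text that are within Hamming distance k from one of the patterns. We develop a streaming algorithm for this problem with $$\mathcal{O}(k d \log^k d \mathop{\mathrm{polylog}}{\,n})$$ space and $$\mathcal{O}(k \log^{k} d \mathop{\mathrm{polylog}}{\,n} + |\mathrm{output}|)$$ time per position of the text. The algorithm is randomised and outputs correct answers with high probability. On the lower bound side, we show that any streaming algorithm for dictionary matching with k mismatches requires $$\varOmega(k d)$$ bits of space.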
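To make the problem statement concrete, the following is a minimal brute-force sketch of dictionary matching with k mismatches. It is not the streaming algorithm from the paper: it keeps the whole text in memory and compares every pattern against every alignment, so it only pins down what the expected output is. The function names and the choice to report matches as `(end_position, pattern_index)` pairs are assumptions made for this illustration.

```python
from typing import Iterable, List, Tuple


def hamming_at_most(a: str, b: str, k: int) -> bool:
    """Return True if equal-length strings a and b differ in at most k positions."""
    mismatches = 0
    for x, y in zip(a, b):
        if x != y:
            mismatches += 1
            if mismatches > k:
                return False  # stop early once the budget is exceeded
    return True


def naive_dictionary_k_mismatch(
    text: str, patterns: Iterable[str], k: int
) -> List[Tuple[int, int]]:
    """Brute-force baseline (hypothetical helper, not the paper's method).

    Returns (end_position, pattern_index) for every substring of `text`
    that is within Hamming distance k of a pattern. `end_position` is the
    0-based index of the last text letter of the match, mimicking an
    algorithm that reports matches as each letter arrives.
    """
    pats = list(patterns)
    out: List[Tuple[int, int]] = []
    for end in range(len(text)):
        for idx, p in enumerate(pats):
            if not p:
                continue  # ignore empty patterns
            start = end - len(p) + 1
            if start >= 0 and hamming_at_most(text[start:end + 1], p, k):
                out.append((end, idx))
    return out


if __name__ == "__main__":
    print(naive_dictionary_k_mismatch("abracadabra", ["abra", "cab"], 1))
```

The running time of this baseline is proportional to the text length times the total length of the patterns, which is the cost the streaming setting is designed to avoid.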

Similar resources

A Practical Index for Approximate Dictionary Matching with Few Mismatches

Approximate dictionary matching is a classic string matching problem, with applications in, e.g., online catalogs, geolocation (mapping possibly misspelled location description to geocoordinates), web searchers, etc. We present a surprisingly simple solution, based on the Dirichlet principle, for matching a keyword with few mismatches and experimentally show that it offers competitive space-tim...

Streaming Periodicity with Mismatches

We study the problem of finding all k-periods of a length-n string S, presented as a data stream. S is said to have k-period p if its prefix of length n − p differs from its suffix of length n − p in at most k locations. We give a one-pass streaming algorithm that computes the k-periods of a string S using poly(k, log n) bits of space, for k-periods of length at most n/2. We also present a two-pa...

Parameterized matching with mismatches

The problem of approximate parameterized string searching consists of finding, for a given text t = t_1 t_2 ... t_n and pattern p = p_1 p_2 ... p_m over respective alphabets Σ_t and Σ_p, the injection π_i from Σ_p to Σ_t maximizing the number of matches between π_i(p) and t_i t_{i+1} ... t_{i+m−1} (i = 1, 2, ..., n − m + 1). We examine the special case where both strings are run-length encoded, and further ...

Fast String Matching with Mismatches

We describe and analyze three simple algorithms that are fast on average for solving the problem of string matching with a bounded number of mismatches. These are the naive algorithm, an algorithm based on the Boyer-Moore approach, and ad-hoc deterministic finite automata searching. We include simulation results that compare these algorithms to previous works.

On String Matching with Mismatches

In this paper, we consider several variants of the pattern matching with mismatches problem. In particular, given a text T = t_1 t_2 ··· t_n and a pattern P = p_1 p_2 ··· p_m, we investigate the following problems: (1) pattern matching with mismatches: for every i, 1 ≤ i ≤ n − m + 1, output the distance between P and t_i t_{i+1} ··· t_{i+m−1}; and (2) pattern matching with k mismatches: output those posit...

Journal

Journal title: Algorithmica

Year: 2021

ISSN: 1432-0541, 0178-4617

DOI: https://doi.org/10.1007/s00453-021-00876-x